RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers
Short TCP flows that are critical for many interactive applications in data
centers are plagued by large flows and head-of-line blocking in switches.
Hash-based load balancing schemes such as ECMP aggravate the matter and result
in long-tailed flow completion times (FCT). Previous work on reducing FCT
usually requires custom switch hardware and/or protocol changes. We propose
RepFlow, a simple yet practically effective approach that replicates each short
flow to reduce the completion times, without any change to switches or host
kernels. With ECMP the original and replicated flows traverse distinct paths
with different congestion levels, thereby reducing the probability of having
long queueing delay. We develop a simple analytical model to demonstrate the
potential improvement of RepFlow. Extensive NS-3 simulations and Mininet
implementation show that RepFlow provides a 50% to 70% speedup in both mean and
99th percentile FCT for all loads, and offers near-optimal FCT when used with
DCTCP.
Comment: To appear in IEEE INFOCOM 201
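The intuition behind replication can be illustrated with a toy calculation. This is not the paper's analytical model: it assumes, purely for illustration, that each ECMP path is congested independently with the same probability, so a short flow is delayed only when every copy lands on a congested path. The function name `prob_all_copies_delayed` and its parameters are hypothetical.

```python
def prob_all_copies_delayed(p_congested: float, copies: int = 2) -> float:
    """Probability that every copy of a short flow hits a congested path.

    Assumption (not from the paper): paths are congested independently,
    each with probability p_congested.
    """
    if not 0.0 <= p_congested <= 1.0:
        raise ValueError("p_congested must lie in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Independent paths: the flow is delayed only if all copies are delayed.
    return p_congested ** copies


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(p, prob_all_copies_delayed(p, copies=1), prob_all_copies_delayed(p, copies=2))
```

Under this simplification, sending one replica squares the probability of a long queueing delay, which is why the gain is largest in the tail of the FCT distribution.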
Dominant Resource Fairness in Cloud Computing Systems with Heterogeneous Servers
We study the multi-resource allocation problem in cloud computing systems
where the resource pool is constructed from a large number of heterogeneous
servers, representing different points in the configuration space of resources
such as processing, memory, and storage. We design a multi-resource allocation
mechanism, called DRFH, that generalizes the notion of Dominant Resource
Fairness (DRF) from a single server to multiple heterogeneous servers. DRFH
provides a number of highly desirable properties. With DRFH, no user prefers
the allocation of another user; no one can improve its allocation without
decreasing that of the others; and more importantly, no user has an incentive
to lie about its resource demand. As a direct application, we design a simple
heuristic that implements DRFH in real-world systems. Large-scale simulations
driven by Google cluster traces show that DRFH significantly outperforms the
traditional slot-based scheduler, leading to much higher resource utilization
with substantially shorter job completion times.
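As background for the notion that DRFH generalizes, the following is a minimal sketch of classic single-server DRF using progressive filling. It is not the DRFH mechanism or the heuristic from the paper. The function name `drf_allocate`, the integer task granularity, and the tie-breaking by user name are assumptions made for illustration.

```python
from fractions import Fraction


def drf_allocate(capacity, demands):
    """Single-server DRF by progressive filling (illustrative sketch).

    capacity: tuple of integer resource capacities, e.g. (cpus, memory_gb).
    demands:  dict mapping user -> integer per-task demand tuple.
    Returns a dict mapping user -> number of tasks allocated.
    """
    used = [0] * len(capacity)
    tasks = {user: 0 for user in demands}
    # Dominant share consumed by a single task of each user.
    per_task = {
        user: max(Fraction(d, c) for d, c in zip(dem, capacity))
        for user, dem in demands.items()
    }
    if any(share == 0 for share in per_task.values()):
        raise ValueError("every user must demand some resource")

    active = set(demands)
    while active:
        # Serve the active user with the smallest dominant share so far
        # (ties broken by user name, an arbitrary choice).
        user = min(sorted(active), key=lambda u: tasks[u] * per_task[u])
        dem = demands[user]
        if all(used[i] + dem[i] <= capacity[i] for i in range(len(capacity))):
            for i in range(len(capacity)):
                used[i] += dem[i]
            tasks[user] += 1
        else:
            # This user's next task no longer fits; stop serving the user.
            active.discard(user)
    return tasks


if __name__ == "__main__":
    # 9 CPUs and 18 GB; A needs <1 CPU, 4 GB> per task, B needs <3 CPU, 1 GB>.
    print(drf_allocate((9, 18), {"A": (1, 4), "B": (3, 1)}))
```

In the example, user A's dominant resource is memory and user B's is CPU; both end with a dominant share of 2/3.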
Nuclei: GPU-Accelerated Many-Core Network Coding
While it is a well-known result that network coding achieves optimal flow rates in multicast sessions, its potential for practical use has remained in question because of its high computational complexity. Our previous work designed a hardware-accelerated, multi-threaded implementation of network coding that fully utilizes multi-core CPUs, as well as SSE2 and AltiVec SIMD vector instructions on x86 and PowerPC processors. This paper takes another step forward and presents the first attempt in the literature to maximize the performance of network coding by taking advantage of not only multi-core CPUs, but also the potentially hundreds of computing cores in commodity off-the-shelf Graphics Processing Units (GPUs). With GPU computing gaining momentum as a result of increased hardware capabilities and improved programmability, our work shows how the GPU, with a design involving thousands of lightweight threads, can boost network coding performance significantly. Many-core GPUs can be deployed as an attractive alternative and complementary solution to multi-core servers, offering a better price/performance ratio. In fact, multi-core CPUs and many-core GPUs can be used to perform network coding simultaneously, which is potentially useful in media streaming servers where hundreds of peers are served concurrently by dedicated servers. In this paper, we present Nuclei, the design and implementation of GPU-based network coding. With Nuclei, a single mainstream NVIDIA 8800 GT GPU outperforms an 8-core Intel Xeon server in most test cases. A combined CPU-GPU encoding scenario achieves coding rates of up to 116 MB/second for a variety of coding settings, which is sufficient to saturate a Gigabit Ethernet interface.
Index Terms: Network coding, many-core GPU computing.
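The abstract does not show Nuclei's GPU kernels, so the following is only a plain Python sketch of the core operation that such an implementation accelerates: producing one coded block as a linear combination of source blocks over a finite field. The choice of GF(2^8) with the reduction polynomial 0x11B and the names `gf_mul` and `encode` are assumptions for illustration, not details taken from the paper.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11B (assumed)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF      # multiply a by x
        if carry:
            a ^= 0x1B            # reduce modulo the field polynomial
    return result


def encode(blocks, coeffs):
    """Return one coded block: sum over i of coeffs[i] * blocks[i] in GF(2^8).

    blocks: list of equal-length bytes objects; coeffs: list of ints in 0..255.
    """
    if len(blocks) != len(coeffs):
        raise ValueError("need one coefficient per block")
    if len({len(b) for b in blocks}) > 1:
        raise ValueError("all blocks must have the same length")
    out = bytearray(len(blocks[0]) if blocks else 0)
    for block, c in zip(blocks, coeffs):
        for i, byte in enumerate(block):
            out[i] ^= gf_mul(c, byte)
    return bytes(out)


if __name__ == "__main__":
    source = [b"\x01\x02", b"\x03\x04"]
    print(encode(source, [1, 1]).hex())
```

Each output byte depends only on the bytes at the same position in the source blocks, so the inner loop is data-parallel; that property is what makes the workload a natural fit for many lightweight threads.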